Intelligent Paper Notes

Global and Local Analysis of Interestingness for Competency-Aware Deep Reinforcement Learning

Pedro Sequeira, Jesse Hostetler, Melinda Gervasio

Categories: Artificial Intelligence | Machine Learning

2022-11-11

In recent years, advances in deep learning have led to many successes in using reinforcement learning (RL) to solve complex sequential decision tasks with high-dimensional inputs. However, existing RL-based systems are essentially competency-unaware: they lack the interpretation mechanisms needed to give human operators an insightful, holistic view of their competence. This impedes their adoption, particularly in critical applications where the decisions an agent makes can have significant consequences. In this paper, we extend a recently proposed framework for explainable RL that is based on analyses of "interestingness." Our new framework provides various measures of RL agent competence stemming from interestingness analysis and is applicable to a wide range of RL algorithms. We also propose novel mechanisms for assessing RL agents' competencies that: 1) identify agent behavior patterns and competency-controlling conditions by clustering agent behavior traces solely using interestingness data; and 2) identify the task elements most responsible for an agent's behavior, as measured through interestingness, by performing global and local analyses using SHAP values. Overall, our tools provide insights about RL agent competence, both capabilities and limitations, enabling users to make more informed decisions about interventions, additional training, and other interactions in collaborative human-machine settings.
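
The local and global analyses mentioned above rest on Shapley values. The following is a minimal sketch of exact Shapley attribution, not the authors' implementation and not the SHAP library's API. It assumes a single baseline input instead of a background dataset, and the model `toy_confidence`, its feature names, and its coefficients are hypothetical stand-ins for a model that predicts one interestingness dimension from task features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) relative to model(baseline).

    Features outside a coalition take their baseline value (an assumption;
    SHAP implementations typically average over a background dataset).
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def global_importance(model, instances, baseline):
    """Global view: mean absolute local attribution per feature."""
    local = [shapley_values(model, x, baseline) for x in instances]
    return [sum(abs(p[i]) for p in local) / len(local) for i in range(len(baseline))]


# Hypothetical model: features are (distance_to_enemy, health, allies_nearby).
def toy_confidence(features):
    distance_to_enemy, health, allies_nearby = features
    return 0.5 * health + 0.2 * allies_nearby - 0.1 * distance_to_enemy * health


x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_confidence, x, baseline)          # local analysis
importance = global_importance(toy_confidence, [x, [1.0, 2.0, 0.0]], baseline)
```

The local attributions sum to the difference between the model output at `x` and at the baseline, which is the property that makes them usable as a per-decision explanation. Exact enumeration is exponential in the number of features, so it only suits small feature sets.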

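Recent years have seen significant advances in explainable AI, as the need to understand deep learning models has grown in importance with the increased emphasis on trust and ethics in AI. Comprehensible models for sequential decision tasks are a particular challenge, because they require understanding not only individual predictions but also a series of predictions that interact with environment dynamics. We present a framework for learning comprehensible models of sequential decision tasks in which agent strategies are characterized using temporal logic formulas. Given a set of agent traces, we first cluster the traces using a novel embedding method that captures frequent action patterns. We then search for logical formulas that explain the agent strategies in the different clusters. We evaluate the framework on combat scenarios in StarCraft II (SC2), using traces from a handcrafted expert policy and from a trained reinforcement learning agent. We implemented a feature extractor for SC2 environments that extracts traces from agent replays as sequences of high-level features describing both the state of the environment and the agent's local behavior. We further designed a visualization tool depicting the movement of units in the environment, which helps in understanding how different task conditions lead to distinct agent behavior patterns in each trace cluster. Experimental results show that our framework is capable of separating agent traces into distinct groups of behaviors, for which our strategy inference approach produces consistent, meaningful, and easily understood strategy descriptions.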
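
The trace-clustering step can be illustrated with a small sketch. It is not the paper's embedding method: as a stated assumption, each trace is embedded as the relative frequency of its consecutive action pairs (bigrams), and traces are grouped with average-linkage agglomerative clustering under cosine distance. The action names and example traces are hypothetical.

```python
from collections import Counter
from math import sqrt


def bigram_embedding(trace):
    """Relative frequency of each consecutive action pair (needs >= 2 actions)."""
    counts = Counter(zip(trace, trace[1:]))
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}


def cosine_distance(a, b):
    """Cosine distance between two sparse embeddings stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / (norm_a * norm_b)


def cluster_traces(traces, num_clusters):
    """Average-linkage agglomerative clustering over bigram embeddings."""
    embeddings = [bigram_embedding(t) for t in traces]
    clusters = [[i] for i in range(len(traces))]

    def linkage(c1, c2):
        dists = [cosine_distance(embeddings[i], embeddings[j]) for i in c1 for j in c2]
        return sum(dists) / len(dists)

    # Repeatedly merge the two closest clusters until num_clusters remain.
    while len(clusters) > num_clusters:
        pairs = [(linkage(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters)) for b in range(a + 1, len(clusters))]
        _, a, b = min(pairs)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical traces: two aggressive and two defensive behavior patterns.
traces = [
    ["move", "move", "attack", "attack"],
    ["move", "attack", "attack", "attack"],
    ["retreat", "retreat", "heal", "retreat"],
    ["retreat", "heal", "heal", "retreat"],
]
clusters = cluster_traces(traces, 2)
```

On these toy traces the two aggressive traces end up in one cluster and the two defensive traces in the other. Each resulting cluster would then be the input to a separate search for an explanatory formula.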

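We propose a novel generative method for producing unseen and plausible counterfactual examples for reinforcement learning (RL) agents, based on outcome variables that characterize agent behavior. Our approach uses a variational autoencoder to train a latent space that jointly encodes information about the observations and the outcome variables pertaining to an agent's behavior. Counterfactuals are generated by traversing this latent space, via gradient-driven updates as well as latent interpolations toward cases drawn from a pool of examples. These include updates that raise the likelihood of the generated examples, which improves the plausibility of the resulting counterfactuals. In experiments in three RL environments, we show that these methods produce counterfactuals that are more plausible and closer to their queries than purely outcome-driven or case-based baselines. Finally, we show that a latent space jointly trained to reconstruct both the input observations and the behavioral outcome variables produces higher-quality counterfactuals than a latent space trained solely to reconstruct the observation inputs.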
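
The latent-interpolation idea can be sketched as follows. This is an illustrative toy, not the paper's method: the latent vectors are given directly instead of coming from a trained variational autoencoder, and `toy_outcome`, the case pool, and the stopping rule (first point on the path whose predicted outcome reaches the target) are all assumptions.

```python
def interpolate(z_query, z_case, alpha):
    """Linear interpolation between two latent vectors."""
    return [(1 - alpha) * q + alpha * c for q, c in zip(z_query, z_case)]


def nearest_case(z_query, pool, outcome_fn, target):
    """Closest pool latent (squared Euclidean) whose outcome already meets the target."""
    valid = [z for z in pool if outcome_fn(z) >= target]
    if not valid:
        return None
    return min(valid, key=lambda z: sum((a - b) ** 2 for a, b in zip(z, z_query)))


def latent_counterfactual(z_query, z_case, outcome_fn, target, steps=20):
    """Walk from the query toward the case; stop at the first latent point
    whose predicted outcome reaches the target, staying close to the query."""
    for k in range(1, steps + 1):
        alpha = k / steps
        z = interpolate(z_query, z_case, alpha)
        if outcome_fn(z) >= target:
            return z, alpha
    return None, None


# Hypothetical outcome predictor acting on a 2-D latent space.
def toy_outcome(z):
    return z[0] + z[1]


z_query = [0.0, 0.0]
pool = [[1.0, 1.0], [3.0, 3.0], [0.1, 0.1]]
z_case = nearest_case(z_query, pool, toy_outcome, target=0.95)
z_cf, alpha = latent_counterfactual(z_query, z_case, toy_outcome, target=0.95)
# In a full pipeline, z_cf would be passed through the decoder to obtain
# a counterfactual observation; that step is omitted here.
```

Stopping at the first point that reaches the target keeps the counterfactual close to the query, which corresponds to the proximity criterion mentioned in the abstract.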

Unlike tabular data, features in network data are interconnected within a domain-specific graph. Examples of this setting include gene expression overlaid on a protein interaction network (PPI) and user opinions in a social network. Network data is typically high-dimensional (a large number of nodes) and often contains outlier snapshot instances and noise. In addition, it is often non-trivial and time-consuming to annotate instances with global labels (e.g., disease or normal). How can we jointly select discriminative subnetworks and representative instances for network data without supervision? We address these challenges within an unsupervised framework for joint subnetwork and instance selection in network data, called UISS, via a convex self-representation objective. Given an unlabeled network dataset, UISS identifies representative instances while ignoring outliers. It outperforms state-of-the-art baselines on both discriminative subnetwork selection and representative instance selection, achieving up to 10% accuracy improvement on all real-world data sets we use for evaluation. When employed for exploratory analysis of RNA-seq network samples from multiple studies, it produces interpretable and informative summaries.
